Systematic Comparison and Rational Design of Theophylline Riboswitches for Effective Gene Repression

ABSTRACT Riboswitches are promising regulatory tools in synthetic biology. To date, 25 theophylline riboswitches have been developed for regulation of gene expression in bacteria. However, their regulatory effects have not been systematically evaluated. To promote efficient selection and application of theophylline riboswitches, we examined 25 theophylline riboswitches in Escherichia coli MG1655 and found that they varied widely in terms of activation/repression ratios and expression levels in the absence of theophylline. Of the 20 riboswitches that activate gene expression, only one exhibited a high activation ratio (63.6-fold) and low expression level without theophylline. Furthermore, none of the five riboswitches that repress gene expression achieved more than 2.0-fold repression. To obtain an effective repression system, we rationally designed a novel theophylline riboswitch to control a downstream gene or genes by premature transcription termination. This riboswitch allowed theophylline-dependent downregulation of the TurboRFP reporter in a dose- and time-dependent manner. Its performance profile exceeded those of previously described repressive theophylline riboswitches. As a second component, we fused the coding sequence of a RepA tag (a protein degradation tag) to the 5′ end of the turborfp gene, which further reduced the protein level without weakening the repressive effect of the riboswitch. By combining two tandem theophylline riboswitches with a RepA tag, we constructed a regulatory cassette that represses the expression of the gene(s) of interest at both the transcriptional and posttranslational levels. This regulatory cassette can be used to repress the expression of any gene of interest and represents a crucial step toward harnessing theophylline riboswitches and expanding the synthetic biology toolbox.
IMPORTANCE A variety of gene expression regulation tools with significant regulatory effects are essential for the construction of complex gene circuits in synthetic biology. Riboswitches have received wide attention due to their unique biochemical, structural, and genetic properties. Here, we have not only systematically and precisely characterized the regulatory properties of previously developed theophylline riboswitches but also engineered a novel repressive theophylline riboswitch acting at the transcriptional level. By introducing coding sequences of a tandem riboswitch and a RepA protein degradation tag at the 5′ end of the reporter gene, we successfully constructed a simple and effective regulatory cassette for gene regulation. Our work provides useful biological components for the construction of synthetic biology gene circuits.


Mathematical model
To generate this model, we applied the following assumptions and simplifications. Our goal was to obtain a simplified model that is suitable for computational analysis while avoiding oversimplification that would sacrifice biological relevance.

1) The cellular environment is homogeneous. Although the physiological state of cells differs between growth stages, to simplify the model and avoid introducing too many parameters, we assumed that, within the scope of the model, the biological components of every cell are the same. That is, all cells in the same culture share the same physiological state and gene expression, and mother and daughter cells have the same composition. The model parameters are assumed to remain essentially stable. During data preprocessing, TurboRFP fluorescence intensity (FI) refers to the FI of an individual cell.

2) Theophylline enters the cell by passive diffusion. Previous studies of theophylline uptake in Escherichia coli showed that the intracellular concentration is only 7 nM when the external concentration is 10 μM (1). This suggests that E. coli does not have an active transport system for theophylline. We therefore assume that theophylline enters the cell by passive diffusion. Using Te to denote the extracellular theophylline concentration and Ti to denote the intracellular theophylline concentration, then:

[Equations ① and ② for Te and Ti; symbols lost in extraction]

3) The transcription of turborfp is tightly controlled by the theophylline riboswitch. The expression of turborfp was regulated through the theophylline riboswitch by changing the concentration of theophylline. Promoter activity was not affected by theophylline. Changes in the turborfp mRNA degradation rate caused by different 5′-UTR sequences were not considered in this model.
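The passive-diffusion assumption in 2) can be illustrated with a minimal sketch. This is not the authors' equations ① and ②; it assumes a simple first-order exchange, dTi/dt = k_in·Te − k_out·Ti, with Te held constant. The rate constants k_in and k_out, the time unit, and the function name are hypothetical. Only the ratio k_in/k_out is taken from the text (7 nM inside at 10 μM outside).

```python
def simulate_uptake(te_uM, k_in, k_out, t_end, dt=0.01):
    """Euler integration of dTi/dt = k_in*Te - k_out*Ti with constant Te.

    Hypothetical first-order exchange model, not the source's equations.
    Concentrations are in uM; time is in arbitrary units.
    """
    ti = 0.0  # cells start without theophylline
    steps = int(round(t_end / dt))
    for _ in range(steps):
        ti += dt * (k_in * te_uM - k_out * ti)
    return ti


# Steady-state ratio Ti/Te implied by the reported values
# (7 nM = 0.007 uM intracellular at 10 uM extracellular).
ratio = 0.007 / 10.0

k_out = 1.0            # hypothetical efflux constant (1 / time unit)
k_in = ratio * k_out   # chosen so that the steady state matches the ratio

ti = simulate_uptake(10.0, k_in, k_out, t_end=20.0)
print(f"Ti after 20 time units: {ti:.6f} uM")
```

In this sketch the intracellular concentration relaxes toward (k_in/k_out)·Te, so the reported 7 nM is reproduced by construction; the time scale depends entirely on the assumed k_out.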

4) There are two types of transcripts containing the theophylline riboswitch: full-length transcripts and pre-terminated transcripts. The two transcripts have different rates of synthesis and degradation. The full-length transcript is 869 bp in length and contains the turborfp mRNA with two theophylline riboswitches in tandem. The pre-terminated transcript is 177 bp in length and contains two theophylline riboswitches in tandem. Because of their different lengths, the two transcripts take different times to synthesize. Since translation significantly affects mRNA degradation, we assumed that degradation rates also differ between full-length and pre-terminated transcripts. Pre-terminated transcripts containing two theophylline riboswitches in tandem were divided into two types: one that binds one molecule of theophylline and one that binds two molecules of theophylline. Their synthesis and degradation rates are the same.
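The effect of transcript length on synthesis time in 4) can be made concrete with a short calculation. The two lengths are taken from the text; the elongation rate of 50 units per second is a hypothetical placeholder, and a constant elongation rate is itself an assumption of this sketch.

```python
FULL_LENGTH = 869      # full-length transcript, as stated in the text
PRE_TERMINATED = 177   # pre-terminated transcript, as stated in the text


def synthesis_time(length, elongation_rate):
    """Time to synthesize a transcript at a constant elongation rate."""
    if elongation_rate <= 0:
        raise ValueError("elongation_rate must be positive")
    return length / elongation_rate


rate = 50.0  # hypothetical elongation rate (length units per second)

t_full = synthesis_time(FULL_LENGTH, rate)
t_short = synthesis_time(PRE_TERMINATED, rate)

# The ratio depends only on the two lengths, not on the assumed rate.
time_ratio = t_full / t_short
print(f"full-length: {t_full:.2f} s, pre-terminated: {t_short:.2f} s")
print(f"ratio: {time_ratio:.2f}")
```

Whatever rate is assumed, the full-length transcript takes about 4.9 times longer to synthesize than the pre-terminated one, which is why the model assigns the two species different synthesis rates.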

5) Not all RNAs with theophylline riboswitch sequences bind theophylline, and not all theophylline riboswitches that bind theophylline trigger transcriptional termination. This leads to different kinds of theophylline riboswitch-containing RNAs. To clearly distinguish and quantitatively describe these RNAs, we introduced the partition coefficient δ. In the formulas, it is subdivided into δ1 to δ5, each referring to a different partition coefficient. See the table below for details.

6) Different concentrations of theophylline (0-2 mM) have little effect on intracellular metabolism and cell viability. E. coli has been reported to be unable to degrade theophylline (2). Moreover, theophylline in the concentration range used did not affect the growth of E. coli. We therefore assumed that theophylline (0-2 mM) has minimal effects on intracellular metabolism and cell viability.

The interaction between mRNA and intracellular theophylline can produce the following types of mRNA: (i) Full-length mRNA, which is divided into two types. The first is not bound by theophylline and is the major contributor to intracellular TurboRFP; the second is bound by theophylline and is a minor contributor to intracellular TurboRFP, while still occupying theophylline. (ii) Truncated mRNA, which is generated by transcription termination events. These mRNAs can be divided into those that bind one molecule of theophylline, leading to transcription termination, and those that bind two molecules of theophylline, leading to transcription termination.
Using the functions that describe the change of each substance under the given parameters, the curves can be plotted as shown in the figure below (Hypothetical models of intracellular mRNA and protein levels). Because the overall mRNA level also depends on the number of bacteria, we took the bacterial cell number into account by including the classical logistic growth curve of E. coli in the model. The main purpose of these figures is not to determine precisely the content of each substance in individual E. coli cells, but to simulate the basic trends of these substances, which can then be used to express the relationship among three quantities: theophylline, TurboRFP, and time. The simulation results are in excellent agreement with the expected results, indicating that the model is reasonable.
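The logistic growth term mentioned above can be sketched as follows. This uses the standard closed-form logistic curve, N(t) = K / (1 + ((K − N0)/N0)·e^(−rt)); the parameter values and the function name are hypothetical and are not the values fitted in this work.

```python
import math


def logistic_cells(t, n0, carrying_capacity, rate):
    """Closed-form logistic growth: cell number (or density) at time t."""
    a = (carrying_capacity - n0) / n0
    return carrying_capacity / (1.0 + a * math.exp(-rate * t))


# Hypothetical values: initial density 0.01, carrying capacity 1.0
# (arbitrary units), growth rate 1.0 per hour, sampled hourly for 24 h.
curve = [logistic_cells(t, 0.01, 1.0, 1.0) for t in range(25)]

for hour in (0, 6, 12, 24):
    print(f"t = {hour:2d} h  N = {curve[hour]:.4f}")
```

In a model of this kind, the total mRNA or TurboRFP signal of the culture would scale with N(t), whereas the per-cell quantities follow the single-cell equations.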

Hypothetical models of intracellular mRNA and protein levels
The parameters were solved by replacing t, Ti, and TurboRFP with actual experimental data using the "curve fitting" tool in MATLAB 2019b, and the result was plotted in Figure 4B. The X-axis represents growth time, the Y-axis represents theophylline concentration, and the Z1-axis represents TurboRFP FI. The correlation coefficient is 0.9755, indicating a good fit.
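A correlation coefficient between measured and model-predicted TurboRFP FI can be computed as in the sketch below. It assumes the Pearson correlation coefficient; the text does not state which coefficient the MATLAB tool reported. The data values are made-up placeholders, not experimental results.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Placeholder values standing in for measured and fitted TurboRFP FI.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r = pearson_r(observed, predicted)
print(f"r = {r:.4f}")
```

A value close to 1, such as the 0.9755 reported above, means that the fitted surface tracks the measurements closely, although it does not by itself exclude systematic deviations in parts of the range.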

Parameters          Description            Unit
[symbol missing]    Plasmid copy number    molecules

Expression levels of TurboRFP FI were measured in the absence (light blue) and presence (dark blue) of 2 mM theophylline. The numbers above the columns represent activation/repression ratios. Data represent mean ± SD of 3 biological replicates. [...] were normalized to the total protein amount, and the relative expression of TurboRFP was shown as grey bars.